A database is a set of data that is organized and distributed in a manner
that lets users access the stored data easily and conveniently. However, in
the era of big data, traditional methods of data analytics may not be able to
manage and process such large amounts of data. To handle big data
efficiently, this work enhances the use of the Map-Reduce technique for big
data distributed on the cloud. The approach was evaluated on a Hadoop server
and applied to Electroencephalogram (EEG) big data as a case study. The
proposed approach showed a clear improvement in managing and processing
the EEG big data, with an average 50% reduction in response time. The
results provide EEG researchers and specialists with an easy and fast
method of handling EEG big data.

1. INTRODUCTION

Using Hadoop in a cloud-computing environment for this kind of application is efficient for at least
four reasons: (a) its high fault tolerance, (b) its automated data distribution, with the computation load
balanced across the nodes, (c) its parallel computation, and (d) its data locality, which places the
computation as close as possible to the data and so reduces the network overhead of transferring data [1].

Rapid developments in technology (huge data collections, powerful multiprocessor computing such as
dual-core and quad-core machines, and data mining algorithms) support data mining in the business
application community [2]. The term big data is used to describe massive datasets characterized by the 4-V
definition: Volume, Velocity, Variety and Value (such as electronic medical records, biomedical images and
signals, and biometric data) [3].

Electroencephalogram (EEG) data is a kind of biomedical signal dataset and clinical big data.
EEG is a test used to evaluate and record the electrical activity of the brain, and it is widely used in the
diagnosis and analysis of critical diseases. Electrophysiological data is another domain where big data is
applied; it contains approximately 100 multi-channel signals. With the records obtained from each patient
amounting to 5 to 10 Gigabytes (GB) of data, standalone tools such as Markonis [4] were found to be
ineffective at meeting the growing demand for data and the need to support multi-center collaborative
studies with real-time, interactive access [5]. In this paper, the Hadoop engine is used to run the Map-Reduce
processes on EEG big data. The Map step converts the data to an indexed list, which makes comparison
operations among values much easier than before. In the Reduce step, the programmer or developer chooses
the data of interest, which minimizes the amount of data retained and keeps the focus on the data that
matters most.

2. RESEARCH METHOD

To make the analysis of EEG big data more efficient, and so make patient cases easier to understand
and study, EEG big data needs to be processed with Hadoop using the Map-Reduce technique. The use of
Map-Reduce and Hadoop on distributed systems in the cloud-computing environment can contribute to a
significant advance in clinical big-data processing and utilization. In addition, it offers new opportunities
in the emerging era of big-data analysis and improves the outcome of clinical EEG big-data analytical tools.

To use the proposed method (EMRT), we follow these steps:

a. Store the datasets (EEG text files) in a designated folder (the input folder), which serves as the root for
programmers.

b. The input folder is located on a network and is called H. Work; it is treated as the environment path for
the Hadoop folder when Hadoop is installed in a test environment, and it contains the data the programmer
requires. The path can be changed through the configuration.

c. The next important step is to create a database from the folder. The files contain many fields separated
into columns, and every line represents a record. To run Map-Reduce, the records must have the same
structure and data types; every kind of database (SQL Server, image, text file, Oracle, etc.) must have its
own specific Map-Reduce.

d. Then convert these records to a list. The main feature of the list is its index, which enables the
programmer to choose the columns of interest and so organize the data easily. In this paper the values are
the patient's ID number, Age, Time turnoff, and Signal analysis.
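
Steps c and d can be illustrated with a minimal plain-Java sketch. It assumes comma-separated fields and hypothetical column positions; the class name, the method names and the sample line are illustrative only and do not come from the EMRT implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class EegRecordList {
    // Hypothetical column positions; the actual file layout is not given in the text.
    public static final int ID = 0, AGE = 1, TIME_TURNOFF = 2, SIGNAL = 3;

    /** Splits one line (one record) into an indexed list of trimmed fields. */
    public static List<String> toList(String line) {
        List<String> fields = new ArrayList<>();
        for (String field : line.split(",")) {
            fields.add(field.trim());
        }
        return fields;
    }

    /** Picks the columns of interest by index, in the order requested. */
    public static List<String> select(List<String> record, int... columns) {
        List<String> chosen = new ArrayList<>();
        for (int c : columns) {
            chosen.add(record.get(c));
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Made-up sample record with one extra column that is not of interest.
        List<String> record = toList("17, 42, 22:15:00, 0.37, extra");
        System.out.println(select(record, ID, AGE, TIME_TURNOFF, SIGNAL));
    }
}
```

Because every field is reachable by its index, dropping or reordering columns is a matter of passing a different index list to `select`.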

The Map follows fixed steps that convert the text to a list. The Hadoop commands then use the Map
as the reference for their operations: the Hadoop functions create the Map automatically, the programmer
enters the desired procedures, and the required data is obtained. This data is then exported to the output
folder. Both the output folder and the input folder need to be created before running the Java classes, which
are packaged in a Jar file.

In this paper, four columns of the EEG data are taken (ID, Time turnoff, Age, Signal analysis). The
first column is treated as the key and the remaining columns as the values. Comparisons are made on the
values of interest in order to obtain the key. The index starts from zero and runs to the last value in the
record. Through the Java operations Read of Line (ROL) and End Of File (EOF), the problem of the number
of lines is solved: ROL drives the loop over the list, and another command moves from one file to the
next [6]. Hadoop automatically creates a folder that holds the values, but the programmer needs to provide
the names of the input and output folders to Hadoop. These names also need to be provided when exporting
the Java file. Thus, the user sends the input folder to Hadoop, while the programmer sends an empty folder
and specifies its location (which could be on another server). After the Map and Reduce processes run, the
values are sent automatically to the output folder [7]. The programmer needs to specify the path with the IP
address and other information. Thus, the function of the Map is to convert the data to a list, and the Reduce
function is optional, depending on the programmer's main interest.
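
The key and value roles described above can be sketched in plain Java, without the Hadoop API. The sketch assumes comma-separated lines in the column order ID, Time turnoff, Age, Signal analysis; the class name, the method names, the sample records and the filter condition are all hypothetical.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class KeyValueSketch {
    /** Map step: the first column becomes the key, the remaining columns the values. */
    public static Map.Entry<String, List<String>> map(String line) {
        String[] cols = line.split(",");
        List<String> values = new ArrayList<>();
        for (int i = 1; i < cols.length; i++) {
            values.add(cols[i].trim());
        }
        return new SimpleEntry<>(cols[0].trim(), values);
    }

    /** Optional Reduce step: return the keys whose values satisfy the condition. */
    public static List<String> reduce(List<Map.Entry<String, List<String>>> pairs,
                                      Predicate<List<String>> ofInterest) {
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, List<String>> pair : pairs) {
            if (ofInterest.test(pair.getValue())) {
                keys.add(pair.getKey());
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, List<String>>> pairs = new ArrayList<>();
        pairs.add(map("7, 22:10:00, 35, 0.36"));   // made-up record
        pairs.add(map("11, 23:05:00, 61, 1.14"));  // made-up record
        // Hypothetical condition: Age (value index 1) greater than 50.
        System.out.println(reduce(pairs, v -> Integer.parseInt(v.get(1)) > 50));
    }
}
```

Leaving the Reduce step out simply returns every mapped pair, which matches the description of Reduce as optional.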

The programmer reads the text file line by line and extracts each column by choosing its data type;
Hadoop is flexible in conducting these operations. The text type differs from the string type in that a text
is larger than a string. The Map takes text that originated from pictures and converts it to a list, and the
list holds the text as objects, so the Map can store any type of data. The load collects the data on when the
light is turned off. It can give extra information on which one turned the light off first and which one
turned it off last, where the sensor contributes to the function of turning off the light. The main class is
the public class, from which the static classes branch: the Mapper class and the Reducer class from Java.


3. MAPPER FUNCTION 

The Job-Tracker splits the EEG files into a number of blocks, and each block is processed by one
Task-Tracker. Java's read-of-line feature is used to select the columns of interest; this paper selects the
patient ID, Age, Time turnoff, and Signal analysis. The four selected columns are then transformed into the
indexed list. As a further benefit, the size of the data is reduced, which decreases the response time and
makes retrieval more efficient.
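
The split-then-map flow can be imitated on a single machine as a rough sketch. It assumes that blocks are formed by line count (HDFS actually splits files by byte size) and uses a parallel stream to stand in for the Task-Trackers; all class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class BlockMapperSketch {
    /** Splits the lines of a file into blocks of at most blockSize lines. */
    public static List<List<String>> splitIntoBlocks(List<String> lines, int blockSize) {
        List<List<String>> blocks = new ArrayList<>();
        for (int i = 0; i < lines.size(); i += blockSize) {
            int end = Math.min(i + blockSize, lines.size());
            blocks.add(new ArrayList<>(lines.subList(i, end)));
        }
        return blocks;
    }

    /** Mapper for one block: keeps only the first four columns of each line. */
    public static List<List<String>> mapBlock(List<String> block) {
        List<List<String>> out = new ArrayList<>();
        for (String line : block) {
            String[] cols = line.split(",");
            List<String> kept = new ArrayList<>();
            for (int i = 0; i < 4 && i < cols.length; i++) {
                kept.add(cols[i].trim());
            }
            out.add(kept);
        }
        return out;
    }

    /** Maps every block in parallel and concatenates the results in block order. */
    public static List<List<String>> run(List<String> lines, int blockSize) {
        return splitIntoBlocks(lines, blockSize).parallelStream()
                .map(BlockMapperSketch::mapBlock)
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        lines.add("1, 30, 21:00:00, 0.38, unused");  // made-up records
        lines.add("3, 45, 21:30:00, 0.24, unused");
        lines.add("5, 52, 22:00:00, 0.35, unused");
        System.out.println(run(lines, 2));
    }
}
```

Each mapped record is shorter than its source line because the unused columns are dropped, which is the size reduction the paragraph above refers to.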


4. IMPLEMENTATION AND EXPERIMENT EVALUATION

The first step of building a Map is to determine the type of the file, here a text file that can be
read by moving the cursor to the end of each line. Then the programmer needs to determine the values that
are needed by taking them from the columns; the columns need to be fixed so that the Map can be built on
them. In the Java step, the comparison of times is done on text, where the text is converted to time
automatically. The text is converted to time and the times are converted to a list; the Reduce takes two
texts and returns two other texts. In addition to the Java import files, a configuration was made on the
imported files inside Hadoop. The final step is to generate the output file, which holds all the values
specified by the user. Table 1 compares the EMRT, which uses the Map-Reduce technique on Hadoop and
distributes the big data on the cloud, with previous approaches.
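
The text-to-time conversion and comparison might look like the following sketch. It assumes the Time turnoff column is written as HH:mm:ss, which the text does not state, and it interprets "takes two texts and returns two other texts" as returning the pair in chronological order; both are assumptions, and the class and method names are hypothetical.

```java
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.List;

public class TimeCompareSketch {
    /** Converts each time text (assumed format HH:mm:ss) to a LocalTime. */
    public static List<LocalTime> toTimes(List<String> texts) {
        List<LocalTime> times = new ArrayList<>();
        for (String text : texts) {
            times.add(LocalTime.parse(text.trim()));
        }
        return times;
    }

    /** Takes two time texts and returns them ordered as {earlier, later}. */
    public static String[] order(String a, String b) {
        LocalTime ta = LocalTime.parse(a.trim());
        LocalTime tb = LocalTime.parse(b.trim());
        return ta.isAfter(tb) ? new String[] {b, a} : new String[] {a, b};
    }

    public static void main(String[] args) {
        String[] ordered = order("22:15:00", "09:30:00");  // made-up times
        System.out.println(ordered[0] + " is before " + ordered[1]);
    }
}
```

Comparing parsed `LocalTime` values avoids the mistakes that plain string comparison would make if the texts were not zero-padded.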


Table 1. Comparison Table 


Approach                 Response time (s)    Accuracy
Schultz, 2013            0.71                 80.6%
Mohammed et al., 2014    0.99                 70.1%
Wang et al., 2014        1.01                 93.0%
Markonis et al., 2015    0.69                 68.7%
Proposed                 0.59                 96.5%


After the required EEG text files are obtained, HDFS splits these files across multiple computers
(nodes) in the cloud. Hadoop then runs the Map-Reduce model to process the EEG big data in real time and
returns the result to the user. Hadoop's features (reliability, fault tolerance achieved by cloning each
data block on three other nodes, scalability, parallel computing, and high-throughput access to files) make
it efficient for managing large datasets.

In the last step, the EEG data containing the filtered values is sent to the output folder, which
resides in HDFS, and is then retrieved by the client and loaded onto the client's device. The final form of
the EEG list is ready for medical specialists to make decisions based on results that have been analyzed
with typical response time and accuracy and without redundancy, as shown in Table 2.


Table 2. Experiment Result 


            Proposed Approach               Old Data Structure
Patient ID  Response Time (s)  Hit  Miss    Response Time (s)  Hit  Miss
1           0.38               ✓            2.4                ✓
3           0.24               ✓            2.33               ✓
5           0.35               ✓            2.35               ✓
5           1.22               ✓            3.19               ✓
7           0.36               ✓            2.36                    ✓
7           0.41               ✓            2.44               ✓
11          1.14               ✓            2.62               ✓
12          0.5                ✓            2.65                    ✓
12          1.03               ✓            2.68               ✓
17          0.37               ✓            2.72               ✓
19          0.57               ✓            2.75                    ✓
19          1.5                ✓            2.78               ✓
23          0.32                    ✓       2.81               ✓
23          0.51               ✓            2.85               ✓
25          0.32               ✓            2.88               ✓
26          0.2                ✓            2.91               ✓
30          0.2                ✓            2.94               ✓
32          0.48               ✓            2.97               ✓
42          0.22               ✓            3.01                    ✓
42          1.3                ✓            3.04               ✓
43          0.47               ✓            3.07               ✓
43          1.3                ✓            3.1                ✓
45          0.26               ✓            3.14               ✓
45          0.34               ✓            3.17                    ✓
46          0.3                ✓            3.2                ✓
46          0.38               ✓            3.23               ✓
61          0.2                ✓            3.27               ✓
61          1                  ✓            3.3                ✓
76          1.29               ✓            3.33               ✓
Average     0.59                            2.87
Hit Ratio                      96.55%                          86.20%
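
The Average and Hit Ratio rows of Table 2 are plain arithmetic and can be checked with a short sketch. The class and method names are hypothetical, and the count of 28 hits out of 29 queries is inferred from the single miss listed for the proposed approach.

```java
public class TableStatsSketch {
    /** Arithmetic mean of the response times, in seconds. */
    public static double average(double[] times) {
        double sum = 0;
        for (double t : times) {
            sum += t;
        }
        return sum / times.length;
    }

    /** Hit ratio as a percentage: hits divided by the total number of queries. */
    public static double hitRatio(int hits, int total) {
        return 100.0 * hits / total;
    }

    public static void main(String[] args) {
        // 28 hits out of 29 queries gives roughly 96.55, the value reported
        // for the proposed approach in Table 2.
        System.out.println(hitRatio(28, 29));
    }
}
```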


5. CONCLUSION

The results above show clear improvements in the response time and accuracy of the retrieved
data, with an average overall enhancement of 50%, which reflects on the performance of big-data management
and processing.

Table 2 compares the results of previously used techniques with those of EMRT on the same dataset.
EMRT uses the Map-Reduce technique on Hadoop and distributes the big data on the cloud. It showed clearly
better performance than previous related work, which will reflect on the efforts and output of clinical
researchers and experts. Moreover, these improved results will make it easier for global societies to adopt
the Internet of Things (IoT) concept.

a. The use of cloud computing is useful and effective in terms of cost and effort.

b. The Map-Reduce technique is more efficient for big-data management and processing than traditional data
structure techniques.

c. Clinical data, especially EEG data, should be distributed on the cloud for greater reliability and ease of
use.

d. EEG data management should use Map-Reduce on Hadoop so that researchers and experts can easily and
efficiently retrieve the information they need and carry out their studies.